Latent perceptual mapping: a new acoustic modeling framework for speech recognition

نویسندگان

  • Shiva Sundaram
  • Jerome R. Bellegarda
چکیده

While hidden Markov modeling is still the dominant paradigm for speech recognition, in recent years there has been renewed interest in alternative, template-like approaches to acoustic modeling. Such methods sidestep usual HMM limitations as well as inherent issues with parametric statistical distributions, though typically at the expense of large amounts of memory and computing power. This paper introduces a new framework, dubbed latent perceptual mapping, which naturally leverages a reduced dimensionality description of the observations. This allows for a viable parsimonious template-like solution where models are closely aligned with perceived acoustic events. Context-independent phoneme classification experiments conducted on the TIMIT database suggest that latent perceptual mapping achieves results comparable to conventional acoustic modeling but at potentially significant savings in online costs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

متن کامل

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

متن کامل

Grapheme-based Automatic Speech Recognition using Probabilistic Lexical Modeling

Automatic speech recognition (ASR) systems incorporate expert knowledge of language or the linguistic expertise through the use of phone pronunciation lexicon (or dictionary) where each word is associated with a sequence of phones. The creation of phone pronunciation lexicon for a new language or domain is costly as it requires linguistic expertise, and includes time and money. In this thesis, ...

متن کامل

Developing a perceptual appropriate framework for Iranian teacher’s dispatch to schools abroad

Purpose: The current research was done with employing comparative studies and its aim was developing a perceptual appropriate framework of Iranian teacher’s dispatch to schools abroad.  Methodology: The Iran's dispatch component and the selective countries, (included: America, France, Germany, Canada, England, Russia, Australia, Turkish, Japan, South Korean and Singapore) was identified and ext...

متن کامل

Effective Triphone Mapping for Acoustic Modeling in Speech Recognition

This paper presents effective triphone mapping for acoustic models training in automatic speech recognition, which allows the synthesis of unseen triphones. The description of this data-driven model clustering, including experiments performed using 350 hours of a Slovak audio database of mixed read and spontaneous speech, are presented. The proposed technique is compared with treebased state ty...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010